A new test of independence under neutrosophic statistics for testing the association between two
criteria of classification is presented in this paper. The necessary contingency tables for the
neutrosophic population and the neutrosophic sample are presented. The test statistic of the
proposed test is introduced under neutrosophic statistics. A real example from education is
selected to explain the proposed test. From the real example, it is concluded that the proposed
test of independence is more informative, flexible, and suitable to be applied under uncertainty
as compared to the existing test under classical statistics.


1. Introduction 


One of the important uses of the chi-square distribution is in the test of independence. In this
test, the association between two categorical variables is tested using a statistic with an
asymptotic chi-square distribution. The test of independence is applied to test the null
hypothesis that the classifications according to two criteria are independent versus the
alternative hypothesis that they are associated. The Pearson correlation is applied when the
variables under study are quantitative, while the test of independence is applied to assess the
association between qualitative variables. McHugh [1] discussed the application of the chi-square
test in medical science. Singhal and Rana [2] mentioned some applications of the chi-square test.
Benhamou and Melot [3] presented a graphical interpretation of the test. Lin et al. [4] introduced
a bootstrap version of the test and applied it in the biopharmaceutical industry. Kroonenberg and
Verbeek [5] discussed some technical issues of the contingency table. Dutton et al. [6, 7]
discussed the application of the test in education. More applications of the test can be seen in
[8-17].

The chi-square test of independence is designed under the assumption that all observed
frequencies in the contingency table are determined (precise) values. In practice, however, the
recorded data are not always precise and determined. For example, an education expert may be able
to make only an approximate estimate. In this situation, the test using the fuzzy approach is
applied instead of the test under classical statistics. Runkler [18] presented the chi-square test
using fuzzy logic. Parthiban and Gajivaradhan [19] discussed the application of the fuzzy-based
chi-square test to environmental data. Lin et al. [20] proposed this test using membership
functions. Taheri et al. [21] worked on fuzzy-based contingency tables. Alevizos et al. [22]
discussed the application to interval data from education. Anezakis et al. [23] applied the fuzzy
approach-based test to invasive species data.

Neutrosophic logic, an extension of fuzzy logic, was introduced by Smarandache [24]. Neutrosophic
logic provides additional information, called the measure of indeterminacy. This logic is
reported to be more efficient than fuzzy logic and interval-based analysis; see [25]. Several
authors have worked on neutrosophic logic and provided applications; see, for example, [26-36].
Smarandache [37] introduced neutrosophic statistics, which can be applied to data having
indeterminate observations using the idea of neutrosophic logic. Neutrosophic data are expressed
as neutrosophic numbers $X_N = X_L + X_U I_N;\ I_N \in [I_L, I_U]$, where $X_L$ and $X_U I_N$ are
the determinate and indeterminate parts, respectively. Note that $I_N \in [I_L, I_U]$ represents
the indeterminacy interval. Suppose that $I_N \in [0, 2]$ and $X_N = 2 + 3I_N$; the neutrosophic
data will then be in the form of the interval $[2, 8]$. For more details, the reader may refer to
[38-40], which mention applications of neutrosophic numbers. Aslam et al. [41-44] introduced some
statistical tests under neutrosophic statistics.
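
As a small illustration of this notation, the sketch below evaluates a neutrosophic number of the
form "determinate part plus coefficient times $I_N$" at the two ends of its indeterminacy
interval. The function name `neutrosophic_interval` and its arguments are hypothetical and chosen
only for this example; the sketch assumes a nonnegative coefficient on $I_N$.

```python
def neutrosophic_interval(determinate, coefficient, i_low, i_high):
    """Return (lower, upper) for determinate + coefficient * I_N, I_N in [i_low, i_high].

    Assumes coefficient >= 0, so the lower end is reached at I_N = i_low.
    """
    if coefficient < 0:
        raise ValueError("this sketch assumes a nonnegative coefficient")
    lower = determinate + coefficient * i_low
    upper = determinate + coefficient * i_high
    return lower, upper


# The example from the text: X_N = 2 + 3*I_N with I_N in [0, 2].
print(neutrosophic_interval(2, 3, 0, 2))  # (2, 8)
```

Setting both ends of the indeterminacy interval to zero returns the determinate part alone, which
is the classical (determined) observation.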

A large body of literature on the chi-square test of independence using classical statistics and
the fuzzy-based approach is available. The existing tests are unable to provide information about
the measure of indeterminacy when the data are obtained from a complex system. The existing test
under classical statistics can be improved using the idea of neutrosophic statistics. By exploring
the literature and to the best of our knowledge, there is no work on designing the chi-square test
of independence under neutrosophic statistics. In this paper, we introduce the chi-square test of
independence under indeterminacy, together with the contingency tables and the chi-square
statistic under indeterminacy. The application of the proposed test is illustrated using real data
from education. We expect that the proposed test will be the best alternative to the existing
tests in an uncertain environment. The rest of the paper is organized as follows: the proposed
test of independence is discussed in Section 2. The application and a comparative study are
discussed in Sections 3 and 4. A simulation study is given in Section 5, and some concluding
remarks are given in the last section.


2. Proposed Test of Independence 


One of the most important applications of the chi-square distribution is to test whether two
criteria of classification are independent. As mentioned earlier, the existing test of
independence can be applied only when the observations in the $r \times c$ contingency table are
determined, where $r$ is the number of rows and $c$ is the number of columns of the contingency
table. In this section, we propose the test of independence under neutrosophic statistics. The
classification according to two criteria of the neutrosophic population is given in Table 1. The
classification according to two criteria of the neutrosophic sample is given in Table 2. The main
aim of introducing the new test of independence under neutrosophic statistics is to test the
neutrosophic null hypothesis $H_{0N}$ that, in the neutrosophic population, the two criteria are
independent versus the alternative hypothesis $H_{1N}$ that, in the neutrosophic population, the
two criteria are associated.

The necessary steps to evaluate the proposed test of 
independence under the neutrosophic statistic are explained 
as follows: 


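Step 1: state the null hypothesis $H_{0N}$ that the two criteria are independent versus the
alternative hypothesis $H_{1N}$ that the two criteria are associated.

Step 2: specify the level of significance $\alpha$.

Step 3: compute the neutrosophic values of the expected frequency $E_N \in [E_L, E_U]$ for each
cell and the corresponding terms $(O_N - E_N)^2 / E_N$, where $O_N$ denotes the neutrosophic
observed frequency.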

Step 4: the proposed statistic $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is given by

$$\chi_N^2 = \sum \left[\frac{(O_N - E_N)^2}{E_N}\right];\quad \chi_N^2 \in [\chi_L^2, \chi_U^2],\ E_N \in [E_L, E_U]. \tag{1}$$

The neutrosophic form of $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is given by

$$\chi_N^2 = \chi_L^2 + \chi_U^2 I_N;\quad I_N \in [I_L, I_U]. \tag{2}$$

Note here that the proposed statistic $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is an extension of the
existing statistic $\chi^2$ used for the test of independence. The proposed test reduces to the
test under classical statistics when $I_N = 0$.


Step 5: select the critical value, say $\chi_C^2$, from the chi-square table at
$(r - 1) \times (c - 1)$ degrees of freedom and level of significance $\alpha$.

Step 6: reject $H_{0N}$ if $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is larger than the critical value.
That is, $H_{0N}$ will be rejected if the values of both $\chi_L^2$ and $\chi_U^2$ are larger than
$\chi_C^2$. In some cases, the critical value $\chi_C^2$ may lie between the value of $\chi_L^2$
and the value of $\chi_U^2$. In this situation, according to Smarandache [37], $H_{0N}$ will be
rejected if $\chi_U^2 > \chi_C^2$; otherwise, do not reject it.
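
The steps above do not spell out how the expected frequencies are obtained. The block below is a
sketch under one assumption: the usual independence formula (row total times column total divided
by the grand total) is applied separately to the lower and upper tables. The indexed symbols
$O_{ijL}$ and $O_{ijU}$, for the lower and upper observed frequencies of cell $(i, j)$, are
introduced here for illustration and are not the paper's notation.

```latex
E_{ijL} = \frac{\left(\sum_{l=1}^{c} O_{ilL}\right)\left(\sum_{k=1}^{r} O_{kjL}\right)}
               {\sum_{k=1}^{r}\sum_{l=1}^{c} O_{klL}},
\qquad
E_{ijU} = \frac{\left(\sum_{l=1}^{c} O_{ilU}\right)\left(\sum_{k=1}^{r} O_{kjU}\right)}
               {\sum_{k=1}^{r}\sum_{l=1}^{c} O_{klU}},

\chi_L^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ijL} - E_{ijL})^2}{E_{ijL}},
\qquad
\chi_U^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ijU} - E_{ijU})^2}{E_{ijU}}.
```

Under this assumption, the Student-1/Mathematics cell of the real example in Section 3 gives
$38 \times 22 / 113 \approx 7.40$ and $60 \times 38 / 169 \approx 13.49$, which agrees with the
expected frequency reported for that cell in Table 4.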


3. Application of the Proposed Test 


In this section, we discuss the application of the proposed test using real data from education.
The education expert is interested in investigating the association between the “student’s
group-community” and the “department.” The details about the real data can be seen in [45].
According to Alevizos et al. [22], “consider three groups of candidate-students who are preparing
to enter in a university, where each group represents a different geographical community,
indicated as Student-1, Student-2, and Student-3. In order to succeed in a university, the
students should choose to participate at exactly one from a choice of three directions-departments
of the university indicated as Mathematics, Physics, and Literature.” As mentioned earlier, the
education expert is interested in testing the null hypothesis that “student’s group-community” and
“department” are independent versus the alternative hypothesis that “student’s group-community”
and “department” are associated. The data of the two criteria are taken from [22] and reported in
Table 3. From Table 3, it is clear that the observations of “student’s group-community” and
“department” are presented as neutrosophic intervals. Therefore, the use of the test of
independence under classical statistics may mislead the education expert. In addition, the
existing test is unable to provide the measure of indeterminacy associated with the test of
independence. Therefore, the education expert is interested in using the proposed test for testing
the association between “student’s group-community” and “department.” The proposed test for the
data is stated as follows:


Step 1: state the null hypothesis $H_{0N}$ that “student’s group-community” and “department” are
independent versus $H_{1N}$ that “student’s group-community” and “department” are associated.

TABLE 1: Two-way classification of the neutrosophic population.

Rows: level of the 2nd criterion of classification. Columns: level of the 1st criterion of classification.

Level | 1 | 2 | 3 | ... | c | Total
1 | $N_{11}$ | $N_{12}$ | $N_{13}$ | ... | $N_{1c}$ | $N_{1.}$
2 | $N_{21}$ | $N_{22}$ | $N_{23}$ | ... | $N_{2c}$ | $N_{2.}$
3 | $N_{31}$ | $N_{32}$ | $N_{33}$ | ... | $N_{3c}$ | $N_{3.}$
r | $N_{r1}$ | $N_{r2}$ | $N_{r3}$ | ... | $N_{rc}$ | $N_{r.}$
Total | $N_{.1}$ | $N_{.2}$ | $N_{.3}$ | ... | $N_{.c}$ | $N_N$

Each cell is a neutrosophic number $N_{ij} = N_{ijL} + N_{ijU} I_N;\ I_N \in [I_L, I_U]$.

TABLE 2: Two-way classification of the neutrosophic sample.

Rows: level of the 2nd criterion of classification. Columns: level of the 1st criterion of classification.

Level | 1 | 2 | 3 | ... | c | Total
1 | $n_{11}$ | $n_{12}$ | $n_{13}$ | ... | $n_{1c}$ | $n_{1.}$
2 | $n_{21}$ | $n_{22}$ | $n_{23}$ | ... | $n_{2c}$ | $n_{2.}$
3 | $n_{31}$ | $n_{32}$ | $n_{33}$ | ... | $n_{3c}$ | $n_{3.}$
r | $n_{r1}$ | $n_{r2}$ | $n_{r3}$ | ... | $n_{rc}$ | $n_{r.}$
Total | $n_{.1}$ | $n_{.2}$ | $n_{.3}$ | ... | $n_{.c}$ | $n_N$

Each cell is a neutrosophic number $n_{ij} = n_{ijL} + n_{ijU} I_N;\ I_N \in [I_L, I_U]$.

TABLE 3: Observed frequency of the real data.

Name | Mathematics | Physics | Literature | Total
Student-1 | $6 + 10I_N;\ I_N \in [0, 0.4]$ | $20 + 26I_N;\ I_N \in [0, 0.2308]$ | $12 + 24I_N;\ I_N \in [0, 0.5]$ | $38 + 60I_N;\ I_N \in [0, 0.3667]$
Student-2 | $15 + 25I_N;\ I_N \in [0, 0.4]$ | $30 + 50I_N;\ I_N \in [0, 0.4]$ | $17 + 17I_N;\ I_N \in [0, 0]$ | $62 + 92I_N;\ I_N \in [0, 0.3261]$
Student-3 | $1 + 3I_N;\ I_N \in [0, 0.6667]$ | $5 + 5I_N;\ I_N \in [0, 0]$ | $7 + 9I_N;\ I_N \in [0, 0.2222]$ | $13 + 17I_N;\ I_N \in [0, 0.2353]$
Total | $22 + 38I_N;\ I_N \in [0, 0.4211]$ | $55 + 81I_N;\ I_N \in [0, 0.3209]$ | $36 + 50I_N;\ I_N \in [0, 0.28]$ | $113 + 169I_N;\ I_N \in [0, 0.3314]$





Step 2: let $\alpha = 0.05$ for this test.

Step 3: the neutrosophic values of the expected frequency $E_N \in [E_L, E_U]$ are shown in
Table 4.


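Step 4: the calculation of the statistic $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is shown as follows:

$$\chi_N^2 = \sum \left[\frac{(O_N - E_N)^2}{E_N}\right] = [4.66, 13.43];\quad \chi_N^2 \in [4.66, 13.43]. \tag{3}$$

The neutrosophic form of $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is given by

$$\chi_N^2 = 4.66 + 13.43 I_N;\quad I_N \in [0, 0.65]. \tag{4}$$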


Step 5: the critical value from the chi-square table at $2 \times 2 = 4$ degrees of freedom and
$\alpha = 0.05$ is 9.49.

Step 6: according to Smarandache [37], the null hypothesis is rejected if $\chi_U^2 > 9.49$. Since
$\chi_U^2 = 13.43 > 9.49$, the null hypothesis that “student’s group-community” and “department”
are independent is rejected.
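
The calculation in Steps 3 to 6 can be retraced with a short script. This is a sketch, not the
authors' code: it assumes that the expected frequencies are row total times column total divided
by the grand total, computed separately for the lower and upper observed tables of Table 3, and
that the upper end of the indeterminacy interval is $(\chi_U^2 - \chi_L^2)/\chi_U^2$. The function
and variable names are hypothetical.

```python
# Lower and upper observed frequencies from Table 3
# (rows: Student-1..3; columns: Mathematics, Physics, Literature).
OBSERVED_LOWER = [[6, 20, 12], [15, 30, 17], [1, 5, 7]]
OBSERVED_UPPER = [[10, 26, 24], [25, 50, 17], [3, 5, 9]]
CRITICAL_VALUE = 9.49  # chi-square table value quoted in Step 5 (df = 4, alpha = 0.05)


def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Pearson chi-square statistic for one determinate contingency table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


chi_lower = chi_square(OBSERVED_LOWER)
chi_upper = chi_square(OBSERVED_UPPER)
# Assumed relation for the upper end of the indeterminacy interval.
indeterminacy = (chi_upper - chi_lower) / chi_upper
reject = chi_upper > CRITICAL_VALUE  # decision rule of Step 6

print(f"chi-square interval: [{chi_lower:.2f}, {chi_upper:.2f}]")
print(f"indeterminacy upper bound: {indeterminacy:.2f}")
print("reject H0N" if reject else "do not reject H0N")
```

With these assumptions the lower statistic comes out at about 4.66 and the upper statistic close
to the reported 13.43; a small difference in the second decimal can arise from rounding the
expected frequencies before summing. The decision is the same as in Step 6.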


4. Comparative Study Based on Real Data 


As mentioned earlier, the proposed test of independence is a generalization of the existing test
under classical statistics. For the comparison, the same values of all parameters are used for
both tests. First, we compare both tests in terms of the values of the test statistics. Then, we
compare both tests in terms of probabilities.

The neutrosophic form of the statistic $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ is
$\chi_N^2 = 4.66 + 13.43 I_N;\ I_N \in [0, 0.65]$. Note here that the proposed neutrosophic form
reduces to the statistic under classical statistics when $I_N = 0$. Therefore, the value
$\chi_L^2 = 4.66$ corresponds to the statistic under classical statistics. From this result, it
can be seen that the proposed test statistic takes values in the indeterminacy interval
$\chi_N^2 \in [4.66, 13.43]$. In contrast, the existing test under classical statistics provides
only a determined value of the test statistic, which is not suitable in an indeterminate
environment. From this comparison, it is concluded that the proposed test is adequate and
effective to be applied under uncertainty as compared to the test of independence under classical
statistics.

TABLE 4: Expected frequency of the real data.

Name | Mathematics | Physics | Literature
Student-1 | $7.40 + 13.49I_N;\ I_N \in [0, 0.4514]$ | $18.49 + 28.76I_N;\ I_N \in [0, 0.3571]$ | $12.11 + 17.75I_N;\ I_N \in [0, 0.3177]$
Student-2 | $12.07 + 20.68I_N;\ I_N \in [0, 0.4163]$ | $30.18 + 44.09I_N;\ I_N \in [0, 0.3155]$ | $19.75 + 27.22I_N;\ I_N \in [0, 0.2744]$
Student-3 | $2.53 + 3.82I_N;\ I_N \in [0, 0.3377]$ | $6.33 + 8.15I_N;\ I_N \in [0, 0.2233]$ | $4.14 + 5.03I_N;\ I_N \in [0, 0.1770]$

TABLE 5: The power of the test.

df | $\beta$ | Power of test | Power of existing test
1 | [0.04, 0.01] | [0.96, 0.97] | 0.95
2 | [0.06, 0.05] | [0.94, 0.95] | 0.94
3 | [0.02, 0.08] | [0.91, 0.94] | 0.9
4 | [0.02, 0.04] | [0.92, 0.96] | 0.92
5 | [0.09, 0.06] | [0.91, 0.94] | 0.9
6 | [0, 0.07] | [0.96, 0.98] | 0.94
7 | [0.04, 0.04] | [0.94, 0.95] | 0.93
8 | [0.09, 0.07] | [0.91, 0.93] | 0.9
9 | [0.04, 0.06] | [0.9, 0.94] | 0.9
10 | [0.07, 0.09] | [0.93, 0.96] | 0.91
20 | [0.04, 0.01] | [0.94, 0.96] | 0.92
30 | [0.05, 0.04] | [0.95, 0.96] | 0.93
40 | [0.03, 0.02] | [0.97, 0.98] | 0.95
50 | [0.05, 0.04] | [0.95, 0.96] | 0.94

FIGURE 1: The power curves of the proposed and existing tests.


Now, we compare both tests in terms of the probabilities related to the null hypothesis. For the
real data, the level of significance is 5%. The existing test tells us that the chance of
accepting the null hypothesis $H_{0N}$ is 0.95, and the probability of rejecting the null
hypothesis is 0.05. In contrast, the proposed test tells us that the chance of accepting the null
hypothesis $H_{0N}$ is 0.95, the probability of rejecting the null hypothesis is 0.05, and the
measure of indeterminacy is 0.65. From this comparison, it can be concluded that the proposed test
provides an additional measure of indeterminacy that cannot be obtained from the existing test
under classical statistics. Therefore, the proposed test is more efficient than the existing test
in providing information about the probabilities associated with the test of independence.


5. Simulation Study 


Now, we discuss the performance of the proposed test in terms of the power of the test using
simulated data. We compare the proposed test and the existing test in terms of the power of the
test. Let $\beta$ denote the type-II error, which is defined as the probability of not rejecting
$H_{0N}$ when $H_{0N}$ is false. The power of the test, $(1 - \beta)$, is then the probability of
rejecting $H_{0N}$ when $H_{0N}$ is false. The values of $\beta$ and $(1 - \beta)$ are shown in
Table 5. The following algorithm is used to construct Table 5:


Step 1: for the $r \times c$ contingency table, specify $df = (r - 1) \times (c - 1)$.

Step 2: compute the values of the statistic $\chi_N^2 \in [\chi_L^2, \chi_U^2]$ and compare them
with the critical value $\chi_C^2$.

Step 3: compute $\beta$ by dividing the number of replications in which $H_{0N}$ is accepted by
the total number of replications, say 100.
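
Steps 2 and 3 can be sketched as follows. The paper does not state how the simulated tables are
generated or how an interval-valued statistic is counted as an acceptance, so this sketch takes
the replicated statistics as given and, as an assumption, compares the lower and upper ends with
the critical value separately. The function name and the example values are hypothetical.

```python
def type_two_error_interval(statistics, critical_value):
    """Estimate beta as the share of replications in which H0N is not rejected.

    `statistics` is a list of (chi_lower, chi_upper) pairs, one per replication.
    Returns (beta from the lower statistics, beta from the upper statistics).
    """
    if not statistics:
        raise ValueError("at least one replication is required")
    n = len(statistics)
    beta_lower = sum(1 for low, _ in statistics if low <= critical_value) / n
    beta_upper = sum(1 for _, up in statistics if up <= critical_value) / n
    return beta_lower, beta_upper


# Hypothetical statistics from four replications (not taken from the paper).
replications = [(4.7, 13.4), (10.2, 12.1), (3.1, 8.0), (9.8, 15.3)]
beta_lower, beta_upper = type_two_error_interval(replications, critical_value=9.49)
print(beta_lower, beta_upper)          # 0.5 0.25
print(1 - beta_lower, 1 - beta_upper)  # 0.5 0.75
```

The two printed pairs are the estimated $\beta$ values and the corresponding powers
$(1 - \beta)$ for the lower and upper ends of the statistic.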


From Table 5, it can be seen that, for $df = 1$ and $df = 2$, the power of the existing test is at
or just below the lower end of the indeterminacy interval of the power of the proposed test. For
$df > 2$, the power of the existing test likewise does not exceed that lower end. The power curves
of both tests are shown in Figure 1. Figure 1 indicates that the proposed test has higher values
of the power of the test as compared to the existing test. From this study, it can be concluded
that the proposed test is better than the existing test in terms of the power of the test.


6. Concluding Remarks 


A new test of independence under neutrosophic statistics for testing the association between two
criteria of classification is presented in this paper. The necessary contingency tables for the
neutrosophic population and the neutrosophic sample are presented. The test statistic of the
proposed test is introduced under neutrosophic statistics. The application of the proposed test is
illustrated using data from education. The comparative study shows the efficiency of the proposed
test over the existing test. We recommend applying the proposed test for association testing in
the presence of indeterminacy. The application of the proposed test to big data is left as future
research. The development of software for the proposed test is also a fruitful area of research.